Testing the relevance of speech rate, pitch and a glottal chink for the perception of age in synthesized speech using formant synthesis
Abstract
Listeners are able to rate a speaker’s age with reasonable accuracy. However, it is still controversial which features reliably signal a speaker’s age. This paper presents the results of a synthesis study in which speech rate, pitch, and a glottal chink were varied systematically, over a range that actually occurs in natural speech, to shift the mean perceived age. The strongest impact on age judgements was found for (i) speech rate, followed by (ii) the glottal chink, while the impact of pitch was only marginal. Some interactions (iii) between the parameters were observed as well. The results for (i) and (ii) show that formant synthesis can produce speech that varies considerably in mean perceived age even if only a small number of features are manipulated. The results for (iii) indicate that studies of the impact of selected features should also consider the interactions between those features.
Similar resources
Statistical Variation Analysis of Formant and Pitch Frequencies in Anger and Happiness Emotional Sentences in Farsi Language
Setup of an emotion recognition or emotional speech recognition system is directly related to how emotion changes the speech features. In this research, the influence of emotion on the anger and happiness was evaluated and the results were compared with the neutral speech. So the pitch frequency and the first three formant frequencies were used. The experimental results showed that there are lo...
Improving Quality of Speech Synthesis in Indian Languages
The harmonic plus noise model (HNM), which divides the speech signal into two sub-bands, harmonic and noise, is implemented with the objective of studying its capabilities for improving the quality of speech synthesis in Indian languages. Investigations show that HNM is capable of synthesizing all vowels and syllables with good quality. All the syllables are intelligible if synthesized using only harm...
Towards synthesis of speaker age: A perceptual study with natural, synthesized and resynthesized stimuli
As a first step towards synthesis of speaker age the hypothesis that spectral cues may be more important for age perception than F0 and duration was tested in a pilot listening experiment with male speaker stimuli consisting of natural, synthesized and resynthesized isolated words. Results indicate that spectral information is dominant over pitch as cues for age. Slow speech rate also seems to ...
The Study of Vowel Space and Formant Structure in Mazani Language
Objective: One of the parameters showing correct phonetic and phonological development is the correct and clear articulation of vowels, which is achieved by changing the shape of vocal cords through altering the height and position of the tongue and the movement of the lips and jaw. The tongue’s height and position are the basis of the production and difference of vowels. In other words, the raw s...
Effect of Spectral Tilt Parameter of Excitation Signal on the Synthesis of Vowel /a/
The nature of the vocal-cord excitation has long interested speech researchers. One of the most important problems has been the specification of source excitations for speech synthesizers. It is known that the shape and periodicity of the vocal-cord excitation are subject to large variations. Investigations were carried out on the effects of various glottal wave parameters on the synthesis of speec...